Skip to content

proxy: support setting proxy via --proxyServer, PROXY_SERVER env var or PROXY_HOST + PROXY_PORT env vars #589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jun 10, 2024

Conversation

ikreymer
Copy link
Member

@ikreymer ikreymer commented May 30, 2024

fixes #587

The proxy env vars PROXY_HOST and PROXY_PORT were being ignored, as they were hardcoded to obsolete values in the Dockerfile.

Proxy settings can now be set, in order of precedence via:

  • --proxyServer cli flag
  • PROXY_SERVER env var
  • PROXY_HOST and PROXY_PORT env vars, which set an HTTP proxy server only (for backwards compatibility with 0.12.x)

The --proxyServer / PROXY_SERVER settings are passed to the browser via the --proxy-server flag.
AsyncFetcher / direct fetch also supports HTTP and SOCKS5 proxying.
Supported proxies are: HTTP no auth, SOCKS5 no auth, SOCKS5 with auth (supported in Brave, but not Chrome!)

… set (to be compatible with 0.12.x)

accidentally always ignored proxy settings before
fixes #587
@ikreymer ikreymer requested a review from tw4l May 30, 2024 04:37
@ikreymer ikreymer changed the title proxy: fix proxy settings when PROXY_HOST and PROXY_PORT env vars are set proxy: support setting proxy via --proxyServer, PROXY_SERVER env var or PROXY_HOST + PROXY_PORT env vars May 30, 2024
@tw4l
Copy link
Member

tw4l commented May 30, 2024

Might involve a bit of effort to spin up a local proxy server, but would be great to add some tests with this PR to ensure the feature is working correctly moving forward.

@vnznznz
Copy link
Contributor

vnznznz commented May 30, 2024

Might involve a bit of effort to spin up a local proxy server, but would be great to add some tests with this PR to ensure the feature is working correctly moving forward.

The socks proxy feature for browsertrix will also depend on this

@vnznznz
Copy link
Contributor

vnznznz commented May 30, 2024

The AsyncFetcher in recorder.ts doesn't handle proxy settings as well. As nodejs doesn't seem to support setting a global proxy, it might make sense to just add support for socks proxy like we had with pywb?

ikreymer added 8 commits June 5, 2024 14:52
tests: add proxy tests for socks5 and http!
- support for SOCKS5 (as supported in Brave though not Chromium) but not HTTP (not supported in any browser w/o interactive prompt)
- tests: update tests to check socks5/http and html/pdf in a loop
@ikreymer
Copy link
Member Author

ikreymer commented Jun 6, 2024

The AsyncFetcher in recorder.ts doesn't handle proxy settings as well. As nodejs doesn't seem to support setting a global proxy, it might make sense to just add support for socks proxy like we had with pywb?

Now fixed, supporting http and socks proxy settings for both browser and direct fetch via async fetcher.
The SOCKS5 proxy also supports auth, since it is supported in Brave (though not Chrome!) with --proxy-server flag.
The HTTP proxy does not support auth, as that's not supported in --proxy-server flag (always results in an interactive prompt)

@ikreymer
Copy link
Member Author

ikreymer commented Jun 6, 2024

Might involve a bit of effort to spin up a local proxy server, but would be great to add some tests with this PR to ensure the feature is working correctly moving forward.

Tests added in proxy.test.js using https://hub.docker.com/r/tarampampam/3proxy/ which supports both HTTP and SOCKS5 proxying!

@ikreymer ikreymer requested a review from vnznznz June 6, 2024 16:40
Copy link
Member

@tw4l tw4l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, just one minor logging suggestion.

Tested locally with 3proxy and with a remote paid socks5 proxy (as part of a service I'm already using), and worked as expected with both. Nice work! Really appreciate the thorough test coverage too.

Comment on lines +34 to +37
for (const scheme of ["socks5", "http"]) {
const port = scheme === "socks5" ? SOCKS_PORT : HTTP_PORT;

for (const type of ["HTML page", "PDF"]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice homegrown parametrization here

@ikreymer ikreymer merged commit e2b4cc1 into main Jun 10, 2024
4 checks passed
@ikreymer ikreymer deleted the fix-proxy branch June 12, 2025 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SOCKS5 + HTTP Proxy support (and add backwards compatible support for 0.12.x support in Browsertrix Crawler 1.x)
3 participants